$$-extension Hidden Markov Models and Weighted Transducers for Machine Transliteration
نویسندگان
چکیده
We describe in detail a method for transliterating an English string to a foreign language string evaluated on five different languages, including Tamil, Hindi, Russian, Chinese, and Kannada. Our method involves deriving substring alignments from the training data and learning a weighted finite state transducer from these alignments. We define an ǫ-extension Hidden Markov Model to derive alignments between training pairs and a heuristic to extract the substring alignments. Our method involves only two tunable parameters that can be optimized on held-out data.
منابع مشابه
Markovian Models for Sequential Data
Hidden Markov Models HMMs are statistical models of sequential data that have been used successfully in many machine learning applications especially for speech recognition Further more in the last few years many new and promising probabilistic models related to HMMs have been proposed We rst summarize the basics of HMMs and then review several recent related learning algorithms and extensions ...
متن کاملTransliteration System Using Pair HMM with Weighted FSTs
This paper presents a transliteration system based on pair Hidden Markov Model (pair HMM) training and Weighted Finite State Transducer (WFST) techniques. Parameters used by WFSTs for transliteration generation are learned from a pair HMM. Parameters from pair-HMM training on English-Russian data sets are found to give better transliteration quality than parameters trained for WFSTs for corresp...
متن کاملMAN-MACHINE INTERACTION SYSTEM FOR SUBJECT INDEPENDENT SIGN LANGUAGE RECOGNITION USING FUZZY HIDDEN MARKOV MODEL
Sign language recognition has spawned more and more interest in human–computer interaction society. The major challenge that SLR recognition faces now is developing methods that will scale well with increasing vocabulary size with a limited set of training data for the signer independent application. The automatic SLR based on hidden Markov models (HMMs) is very sensitive to gesture's shape inf...
متن کاملHidden semi-Markov Model based earthquake classification system using Weighted Finite-State Transducers
Automatic earthquake detection and classification is required for efficient analysis of large seismic datasets. Such techniques are particularly important now because access to measures of ground motion is nearly unlimited and the target waveforms (earthquakes) are often hard to detect and classify. Here, we propose to use models from speech synthesis which extend the double stochastic models f...
متن کاملContinuous Speech Recognition Based on Deterministic Finite Automata Machine using Utterance and Pitch Verification
This paper introduces a set of acoustic modeling techniques for utterance verification (UV) based continuous speech recognition (CSR). Utterance verification in this work implies the ability to determine when portions of a hypothesized word string correspond to incorrectly decoded vocabulary words or out-of-vocabulary words that may appear in an utterance. This capability is implemented here as...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009